Mixed Language Query Disambiguation

Authors

  • Pascale Fung
  • Xiaohuo Liu
  • Chi Shun Cheung

Abstract

We propose a mixed language query disambiguation approach that uses co-occurrence information from monolingual data only. A mixed language query consists of words in a primary language and a secondary language. Our method translates the query into monolingual queries in either language. Two novel features for disambiguation, namely contextual word voting and 1-best contextual word, are introduced and compared to a baseline feature, the nearest neighbor. Average query translation accuracies for the two features are 81.37% and 83.72%, compared to the baseline accuracy ...

Similar resources

Query Translation Disambiguation as Graph Partitioning

Resolving ambiguity in the process of query translation is crucial to cross-language information retrieval when only a bilingual dictionary is available. In this paper we propose a novel approach for query translation disambiguation, named “spectral query translation model”. The proposed approach views the problem of query translation disambiguation as a graph partitioning problem. For a given ...

Using co-occurrence tendencies to improve Cross-Language Information Retrieval

Query disambiguation is considered one of the most important methods for improving the effectiveness of information retrieval. In the present paper, we focus on query term disambiguation via a combined statistical method applied both before and after translation, in order to avoid source language ambiguity as well as incorrect selection of target translations. By combining query expansion with dict...

Cross-Language Information Retrieval via Hybrid Combination of Query Expansion Techniques

This paper describes a new approach in Cross-Language Information Retrieval that combines query expansion techniques before and after query translation and disambiguation. Moreover, a new technique based on domain keywords extraction is proposed. Test results showed the effectiveness of the combined method.

Translation Probabilities in Cross-language Information Retrieval

Translation ambiguity is a major problem in dictionary-based cross-language information retrieval. To attack the problem, indirect disambiguation approaches, which do not explicitly resolve translation ambiguity, rely on query-structuring techniques such as a structured Boolean model and Pirkola’s method. Direct disambiguation approaches try to assign translation probabilities to translation eq...

Ambiguity of Queries and the Challenges for Query Language Detection

In this paper, a sample set of 510 simple searches from the TEL action log 2009 is analyzed for query content and query language. More than half of the queries are for named entities, which has consequences for query language disambiguation. A manual identification of query language finds that often a definite language cannot be determined, because many named entities are not translated. Proble...

Publication date: 1999